Dynamic Obstacle Avoidance with PEARL: PrEference Appraisal Reinforcement Learning
Authors
Abstract
Manual derivation of optimal robot motions for task completion is difficult, especially when a robot must balance its actions between opposing preferences. One solution is to automatically learn near-optimal motions with Reinforcement Learning (RL). This has been successful for several tasks, including swing-free UAV flight, table tennis, and autonomous driving. However, high-dimensional problems remain a challenge. We address this dimensionality constraint with PrEference Appraisal Reinforcement Learning (PEARL), which solves tasks with opposing preferences for acceleration-controlled robots. PEARL projects the high-dimensional continuous robot state space onto a low-dimensional preference feature space, resulting in efficient and adaptable planning. We demonstrate that, on a dynamic obstacle avoidance robotic task, an agent trained once on a much simpler problem performs real-time decision-making for significantly larger, high-dimensional problems with unbounded continuous states and actions. We trained the agent with 4 static obstacles, while the trained agent avoids up to 900 dynamic obstacles in a highly constrained space. We compare these results to traditional, often manually tuned solutions for these high-dimensional problems.
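The key idea in the abstract is the projection from a high-dimensional state (robot plus any number of obstacles) onto a fixed, low-dimensional preference feature vector, so that a policy learned on a small problem transfers to a much larger one. A minimal sketch of that idea follows; the particular features (goal distance, speed, soft obstacle proximity) and the weight values are illustrative assumptions, not the exact features or weights used by PEARL:

```python
import math
import random

def preference_features(pos, vel, goal, obstacles):
    """Project the full state onto three preference features.
    The feature count stays fixed no matter how many obstacles
    there are, which is what makes the policy transferable."""
    dist_goal = math.dist(pos, goal)                 # goal-attraction preference
    speed = math.hypot(*vel)                         # speed-moderation preference
    proximity = sum(math.exp(-math.dist(pos, o))     # soft obstacle-repulsion term
                    for o in obstacles)
    return [dist_goal, speed, proximity]

def appraise(features, weights):
    # A linear appraisal over preference features: the weights
    # trade off goal attraction against obstacle repulsion.
    return sum(w * f for w, f in zip(weights, features))

# Hypothetical learned trade-off weights.
w = [-1.0, -0.1, -5.0]
state = ((0.0, 0.0), (0.0, 0.0), (3.0, 4.0))  # pos, vel, goal

# The same 3-feature appraisal applies to 2 obstacles or 900.
few = [(1.0, 0.0), (0.0, 1.0)]
many = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(900)]
f_few = preference_features(*state, few)
f_many = preference_features(*state, many)
v_few, v_many = appraise(f_few, w), appraise(f_many, w)
```

Because the feature vector has the same length regardless of obstacle count, training against 4 static obstacles and deploying against 900 dynamic ones requires no change to the policy representation.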
Similar resources
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
Avoidance of Multiple Dynamic Obstacles
This article is a continuation of the previous article, “Obstacle Avoidance in Dynamic Environment: a Hierarchical Solution”, which presented a concept for obstacle avoidance in dynamic environments suitable for mobile robots. The task of obstacle avoidance is divided into three principal groups: local, global, and emergency. The global avoidance is approached here, in which the concep...
Improving Learning for Embodied Agents in Dynamic Environments by State Factorisation
A new reinforcement learning algorithm designed specifically for robots and embodied systems is described. Conventional reinforcement learning methods intended for learning general tasks suffer from a number of disadvantages in this domain including slow learning speed, an inability to generalise between states, reduced performance in dynamic environments, and a lack of scalability. Factor-Q, t...
Time Variable Reinforcement Learning and Reinforcement Function Design
We introduce a mathematical model for time-variable reinforcement learning. The policy, the rewards or reinforcement function, and the transition probabilities may depend on the time t. We prove that under certain conditions slightly modified methods of classical dynamic programming ensure finding the optimal policy. For that we derive the Bellman equation for the time-variable ...
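The abstract's derivation is cut off, but for a setting where rewards and transition probabilities are time-indexed, a Bellman equation of the following finite-horizon form would be the natural analogue of the classical one (the notation r_t, p_t, and the discount factor γ are assumptions for illustration, not taken from the paper):

```latex
V_t(s) = \max_{a}\left[\, r_t(s, a) + \gamma \sum_{s'} p_t(s' \mid s, a)\, V_{t+1}(s') \,\right]
```

Here both the immediate reward r_t and the transition kernel p_t carry an explicit time subscript, so the value function V_t must be computed per time step by backward induction rather than by a single stationary fixed point.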
Reactive Vision-Based Navigation Controller for Autonomous Mobile Agents
Initial results of ongoing research in the field of reactive mobile autonomy are presented. The aim is to create a reactive obstacle avoidance method for a mobile agent operating in dynamic, unstructured, and unpredictable environments. The method is inspired by the stimulus-response behavior of simple animals. An obstacle avoidance controller is developed that uses raw visual information of th...
Journal title:
Volume, Issue:
Pages: -
Publication date: 2016